Clustering Using Rough-Set Feature Selection

Authors

  • Usman Qamar
  • John A. Keane
Abstract

Feature selection aims to remove features that are unnecessary to the target concept. Rough-set theory (RST) eliminates unimportant or irrelevant features, generating a set of attributes smaller than the original with the same, or nearly the same, classificatory power. Clustering, also a form of data grouping, partitions a set of data such that intra-cluster similarity is maximized and inter-cluster similarity is minimized. As with classification, clustering is carried out on the basis of a group of attributes or features; hence RST may be used for clustering. This paper analyses the effects of rough sets on clustering using 10 datasets, each including a decision attribute. This generates a framework for applying rough sets for clustering purposes. Rough sets are then used for knowledge discovery in clustering, and the analysis yields a very significant result: removing individual numeric attributes has far more effect on clustering accuracy than removing categorical attributes.
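The abstract does not say which reduction algorithm the authors use. As an illustration only, the following minimal sketch shows one common way rough-set feature selection is carried out: compute the dependency degree of the decision attribute on a subset of attributes, then grow the subset greedily (a QuickReduct-style search) until it matches the dependency of the full attribute set. The function names, the toy table, and the attribute names `a`, `b`, `c`, `d` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, attrs, decision):
    """Dependency degree: fraction of rows whose equivalence class
    (under attrs) carries a single decision value (the positive region)."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows, conditional, decision):
    """Greedy forward selection (an assumed, QuickReduct-style search):
    add the attribute that raises the dependency degree most, until the
    subset is as informative as the full set of conditional attributes."""
    target = dependency(rows, conditional, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        best = max(
            (a for a in conditional if a not in chosen),
            key=lambda a: dependency(rows, chosen + [a], decision),
        )
        chosen.append(best)
    return chosen


# Hypothetical toy table: d is "y" when a or b is 1; c merely duplicates a.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "n"},
    {"a": 0, "b": 1, "c": 0, "d": "y"},
    {"a": 1, "b": 0, "c": 1, "d": "y"},
    {"a": 1, "b": 1, "c": 1, "d": "y"},
]
reduct = quick_reduct(rows, ["a", "b", "c"], "d")
print(reduct)
```

On this toy table the redundant attribute `c` is dropped. In the setting the abstract describes, the reduced attribute set would then be passed to a clustering algorithm and the resulting clusters compared against the decision attribute.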

Similar resources

A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...

Feature Selection Based on the Rough Set Theory and EM Clustering Algorithm

We study the Rough Set theory as a method of feature selection based on tolerant classes that extends the existing equivalent classes. The determination of initial tolerant classes is a challenging and important task for accurate feature selection and classification. In this paper the EM clustering algorithm is applied to determine similar objects. This method generates fewer features with eith...

Feature Selection Based on the Rough Set Theory and Expectation-Maximization Clustering Algorithm

We study the Rough Set theory as a method of feature selection based on tolerant classes that extends the existing equivalent classes. The determination of initial tolerant classes is a challenging and important task for accurate feature selection and classification. In this paper the Expectation-Maximization clustering algorithm is applied to determine similar objects. This method generates fe...

Soft Set Based Feature Selection Approach for Lung Cancer Images

Lung cancer is the deadliest type of cancer for both men and women. Feature selection plays a vital role in cancer classification. This paper investigates the feature selection process in Computed Tomographic (CT) lung cancer images using soft set theory. We propose a new soft set based unsupervised feature selection algorithm. Nineteen features are extracted from the segmented lung images usin...

Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With advances in metagenome data mining, research has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, gene selection, or feature selection, is essential. Removing redundant genes can therefore increase the speed and accuracy of classification. After applying the gene se...

Publication date: 2012